{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Qwen聊天机器人\n",
    "通义千问Qwen是阿里云自主研发的大语言模型，Qwen-1.5-0.5B 为其轻量版，拥有5亿参数，具备低成本和高效文本生成能力。适用于文字创作、文本处理、编程辅助、翻译服务和对话模拟等场景。\n",
    "\n",
    "在本次任务中，我们将基于昇思MindSpore在香橙派开发板上运行通义千问Qwen-1.5-0.5B模型，体验和模型的对话互动，完成和聊天机器人对话。具体操作如下："
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 环境准备\n",
    "\n",
    "开发者拿到香橙派开发板后，首先需要进行硬件资源确认，镜像烧录及CANN和MindSpore版本的升级，才可运行该案例，具体如下：\n",
    "\n",
    "- 硬件： 香橙派AIpro 16G 8-12T开发板\n",
    "- 镜像： 香橙派官网ubuntu镜像\n",
    "- CANN：8.0.RC3.alpha002\n",
    "- MindSpore： 2.4.10\n",
    "\n",
    "### 镜像烧录\n",
    "\n",
    "运行该案例需要烧录香橙派官网ubuntu镜像，烧录流程参考[昇思MindSpore官网--香橙派开发专区--环境搭建指南--镜像烧录](https://www.mindspore.cn/docs/zh-CN/r2.4.10/orange_pi/environment_setup.html#1-%E9%95%9C%E5%83%8F%E7%83%A7%E5%BD%95%E4%BB%A5windows%E7%B3%BB%E7%BB%9F%E4%B8%BA%E4%BE%8B)章节。\n",
    "\n",
    "### CANN升级\n",
    "\n",
    "CANN升级参考[昇思MindSpore官网--香橙派开发专区--环境搭建指南--CANN升级](https://www.mindspore.cn/docs/zh-CN/r2.4.10/orange_pi/environment_setup.html#3-cann%E5%8D%87%E7%BA%A7)章节。\n",
    "\n",
    "### MindSpore升级\n",
    "\n",
    "MindSpore升级参考[昇思MindSpore官网--香橙派开发专区--环境搭建指南--MindSpore升级](https://www.mindspore.cn/docs/zh-CN/r2.4.10/orange_pi/environment_setup.html#4-mindspore%E5%8D%87%E7%BA%A7)章节。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 权重加载\n",
    "\n",
    "此处使用mindnlp套件加载Qwen/Qwen1.5-0.5B-Chat模型权重，该套件包含了许多自然语言处理的常用方法，可以方便地加载和使用huggingface的模型权重。注意，首次运行时请耐心等待模型下载。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#install mindnlp\n",
    "\n",
    "!pip install mindnlp"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "vscode": {
     "languageId": "plaintext"
    }
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/home/HwHiAiUser/.conda/envs/mindspore_py39/lib/python3.9/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for <class 'numpy.float64'> type is zero.\n",
      "  setattr(self, word, getattr(machar, word).flat[0])\n",
      "/home/HwHiAiUser/.conda/envs/mindspore_py39/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for <class 'numpy.float64'> type is zero.\n",
      "  return self._float_to_str(self.smallest_subnormal)\n",
      "/home/HwHiAiUser/.conda/envs/mindspore_py39/lib/python3.9/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.\n",
      "  setattr(self, word, getattr(machar, word).flat[0])\n",
      "/home/HwHiAiUser/.conda/envs/mindspore_py39/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.\n",
      "  return self._float_to_str(self.smallest_subnormal)\n",
      "Building prefix dict from the default dictionary ...\n",
      "Loading model from cache /tmp/jieba.cache\n",
      "Loading model cost 2.232 seconds.\n",
      "Prefix dict has been built successfully.\n",
      "Qwen2ForCausalLM has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`.`PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions.\n",
      "  - If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception).\n",
      "  - If you are not the owner of the model architecture class, please contact the model code owner to update it.\n",
      "Sliding Window Attention is enabled but not implemented for `eager`; unexpected results may be encountered.\n"
     ]
    }
   ],
   "source": [
    "import mindspore\n",
    "from mindnlp.transformers import AutoModelForCausalLM, AutoTokenizer\n",
    "\n",
    "# Loading the tokenizer and model from Hugging Face's model hub.\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"Qwen/Qwen1.5-0.5B-Chat\", ms_dtype=mindspore.float16)\n",
    "model = AutoModelForCausalLM.from_pretrained(\"Qwen/Qwen1.5-0.5B-Chat\", ms_dtype=mindspore.float16)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 模型推理\n",
    "\n",
    "模型加载完成后定义聊天机器人所需处理的事务：加载与处理聊天历史，转换成适合模型输入的格式；并通过流式生成器的推理模式逐步生成聊天机器人的回复。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "vscode": {
     "languageId": "plaintext"
    }
   },
   "outputs": [],
   "source": [
    "from mindnlp.transformers import TextIteratorStreamer\n",
    "from threading import Thread\n",
    "\n",
    "system_prompt = \"You are a helpful and friendly chatbot\"\n",
    "\n",
    "def build_input_from_chat_history(chat_history, msg: str):\n",
    "    messages = [{'role': 'system', 'content': system_prompt}]\n",
    "    for user_msg, ai_msg in chat_history:\n",
    "        messages.append({'role': 'user', 'content': user_msg})\n",
    "        messages.append({'role': 'assistant', 'content': ai_msg})\n",
    "    messages.append({'role': 'user', 'content': msg})\n",
    "    return messages\n",
    "\n",
    "# Function to generate model predictions.\n",
    "def predict(message, history):\n",
    "    # Formatting the input for the model.\n",
    "    messages = build_input_from_chat_history(history, message)\n",
    "    input_ids = tokenizer.apply_chat_template(\n",
    "            messages,\n",
    "            add_generation_prompt=True,\n",
    "            return_tensors=\"ms\",\n",
    "            tokenize=True\n",
    "        )\n",
    "    streamer = TextIteratorStreamer(tokenizer, timeout=300, skip_prompt=True, skip_special_tokens=True)\n",
    "    generate_kwargs = dict(\n",
    "        input_ids=input_ids,\n",
    "        streamer=streamer,\n",
    "        max_new_tokens=1024,\n",
    "        do_sample=True,\n",
    "        top_p=0.9,\n",
    "        temperature=0.1,\n",
    "        num_beams=1,\n",
    "    )\n",
    "    t = Thread(target=model.generate, kwargs=generate_kwargs)\n",
    "    t.start()  # Starting the generation in a separate thread.\n",
    "    partial_message = \"\"\n",
    "    for new_token in streamer:\n",
    "        partial_message += new_token\n",
    "        if '</s>' in partial_message:  # Breaking the loop if the stop token is generated.\n",
    "            break\n",
    "        yield partial_message"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 启动聊天机器人\n",
    "\n",
    "下面我们将启动一个基于Gradio的聊天界面，用于与聊天机器人进行交互。在浏览器中打开链接 [http://127.0.0.1:7860](http://127.0.0.1:7860)，开始与聊天机器人的交互。您可以在页面下方的消息输入框中输入任何问题，或者点击页面下方 **Examples** 中预设的问题，然后点击 **Submit** 按钮与聊天机器人进行对话。注意，第一次回答需要较长时间的加载（约1~2分钟），请耐心等待。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#install gradio\n",
    "\n",
    "!pip install gradio==4.44.0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "vscode": {
     "languageId": "plaintext"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Running on local URL:  http://127.0.0.1:7860\n",
      "\n",
      "To create a public link, set `share=True` in `launch()`.\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/home/HwHiAiUser/.local/lib/python3.9/site-packages/gradio/analytics.py:106: UserWarning: IMPORTANT: You are using gradio version 4.44.0, however version 4.44.1 is available, please upgrade. \n",
      "--------\n",
      "  warnings.warn(\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div><iframe src=\"http://127.0.0.1:7860/\" width=\"100%\" height=\"500\" allow=\"autoplay; camera; microphone; clipboard-read; clipboard-write;\" frameborder=\"0\" allowfullscreen></iframe></div>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/plain": []
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import gradio as gr\n",
    "# Setting up the Gradio chat interface.\n",
    "gr.ChatInterface(predict,\n",
    "                 title=\"Qwen1.5-0.5b-Chat\",\n",
    "                 description=\"问几个问题\",\n",
    "                 examples=['你是谁？', '介绍一下华为公司']\n",
    "                 ).launch()  # Launching the web interface."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "本案例已同步上线 [GitHub 仓](https://github.com/mindspore-courses/orange-pi-mindspore/tree/master/Online/14-qwen1.5-0.5b)，更多案例开发亦可参考该仓库。\n",
    "\n",
    "本案例运行所需环境：\n",
    "\n",
    "- **硬件**：香橙派 AIpro 16G 8-12T 开发板\n",
    "- **镜像**：香橙派官网 Ubuntu 镜像\n",
    "- **CANN**：8.0.RC3.alpha002\n",
    "- **MindSpore**：2.4.10"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
